MERS exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Classification of DNA sequences using Bloom filters

Internal identifier: 002663 (Main/Exploration); previous: 002662; next: 002664

Authors: Henrik Stranneheim [Sweden]; Max Käller [Sweden]; Tobias Allander [Sweden]; Björn Andersson [Sweden]; Lars Arvestad [Sweden]; Joakim Lundeberg [Sweden]

Source:

RBID : ISTEX:9F1C2D341F92D49DE7C480EB2823B669D874C7BC

Abstract

Motivation: New generation sequencing technologies producing increasingly complex datasets demand new efficient and specialized sequence analysis algorithms. Often, it is only the ‘novel’ sequences in a complex dataset that are of interest and the superfluous sequences need to be removed.

Results: A novel algorithm, fast and accurate classification of sequences (FACS), is introduced that can accurately and rapidly classify sequences as belonging or not belonging to a reference sequence. FACS was first optimized and validated using a synthetic metagenome dataset. An experimental metagenome dataset was then used to show that FACS achieves accuracy comparable to that of BLAT and SSAHA2 but is at least 21 times faster in classifying sequences.

Availability: Source code for FACS, the Bloom filters and the MetaSim dataset used is available at http://facs.biotech.kth.se. The Bloom::Faster 1.6 Perl module can be downloaded from CPAN at http://search.cpan.org/~palvaro/Bloom-Faster-1.6/

Contacts: henrik.stranneheim@biotech.kth.se; joakiml@biotech.kth.se

Supplementary information: Supplementary data are available at Bioinformatics online.
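The abstract does not describe how FACS works internally, so the following is only a minimal sketch of the general idea of classifying reads against a reference with a Bloom filter. It is not the FACS implementation: the k-mer decomposition, the SHA-256-based hashing, the filter size, and the `cutoff` value are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array queried through several hash functions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed
        # (an illustrative choice, not the hashing used by FACS).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True for every added item; may also be True for items never
        # added (false positives), but never False for an added item.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Return all overlapping words of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_reference_filter(reference, k, num_bits=1 << 16, num_hashes=4):
    """Insert every k-mer of the reference sequence into a Bloom filter."""
    bf = BloomFilter(num_bits, num_hashes)
    for word in kmers(reference, k):
        bf.add(word)
    return bf


def match_fraction(read, bf, k):
    """Fraction of the read's k-mers that the filter reports as present."""
    words = kmers(read, k)
    if not words:
        return 0.0
    return sum(word in bf for word in words) / len(words)


def classify(read, bf, k, cutoff=0.8):
    """Label a read as belonging to the reference if enough k-mers match.

    The cutoff of 0.8 is an arbitrary illustrative value.
    """
    return match_fraction(read, bf, k) >= cutoff
```

Because a Bloom filter never reports a stored item as absent, a read copied exactly from the reference always scores a match fraction of 1.0 in this sketch; reads from elsewhere can still score above zero through false positives, which is why a cutoff rather than an exact test is used.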

DOI: 10.1093/bioinformatics/btq230


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Classification of DNA sequences using Bloom filters</title>
<author>
<name sortKey="Stranneheim, Henrik" sort="Stranneheim, Henrik" uniqKey="Stranneheim H" first="Henrik" last="Stranneheim">Henrik Stranneheim</name>
</author>
<author>
<name sortKey="K Ller, Max" sort="K Ller, Max" uniqKey="K Ller M" first="Max" last="K Ller">Max Käller</name>
</author>
<author>
<name sortKey="Allander, Tobias" sort="Allander, Tobias" uniqKey="Allander T" first="Tobias" last="Allander">Tobias Allander</name>
</author>
<author>
<name sortKey="Andersson, Bjorn" sort="Andersson, Bjorn" uniqKey="Andersson B" first="Björn" last="Andersson">Björn Andersson</name>
</author>
<author>
<name sortKey="Arvestad, Lars" sort="Arvestad, Lars" uniqKey="Arvestad L" first="Lars" last="Arvestad">Lars Arvestad</name>
</author>
<author>
<name sortKey="Lundeberg, Joakim" sort="Lundeberg, Joakim" uniqKey="Lundeberg J" first="Joakim" last="Lundeberg">Joakim Lundeberg</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:9F1C2D341F92D49DE7C480EB2823B669D874C7BC</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1093/bioinformatics/btq230</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HXZ-XV8ZN2M3-9/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000109</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000109</idno>
<idno type="wicri:Area/Istex/Curation">000109</idno>
<idno type="wicri:Area/Istex/Checkpoint">000534</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000534</idno>
<idno type="wicri:doubleKey">1367-4803:2010:Stranneheim H:classification:of:dna</idno>
<idno type="wicri:Area/Main/Merge">002688</idno>
<idno type="wicri:Area/Main/Curation">002663</idno>
<idno type="wicri:Area/Main/Exploration">002663</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main">Classification of DNA sequences using Bloom filters</title>
<author>
<name sortKey="Stranneheim, Henrik" sort="Stranneheim, Henrik" uniqKey="Stranneheim H" first="Henrik" last="Stranneheim">Henrik Stranneheim</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Suède</country>
<wicri:regionArea>Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm</wicri:regionArea>
<wicri:noRegion>106 91 Stockholm</wicri:noRegion>
</affiliation>
<affiliation></affiliation>
</author>
<author>
<name sortKey="K Ller, Max" sort="K Ller, Max" uniqKey="K Ller M" first="Max" last="K Ller">Max Käller</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Suède</country>
<wicri:regionArea>Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm</wicri:regionArea>
<wicri:noRegion>106 91 Stockholm</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Allander, Tobias" sort="Allander, Tobias" uniqKey="Allander T" first="Tobias" last="Allander">Tobias Allander</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Suède</country>
<wicri:regionArea>Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm</wicri:regionArea>
<wicri:noRegion>106 91 Stockholm</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Andersson, Bjorn" sort="Andersson, Bjorn" uniqKey="Andersson B" first="Björn" last="Andersson">Björn Andersson</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Suède</country>
<wicri:regionArea>Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm</wicri:regionArea>
<wicri:noRegion>106 91 Stockholm</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Arvestad, Lars" sort="Arvestad, Lars" uniqKey="Arvestad L" first="Lars" last="Arvestad">Lars Arvestad</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Suède</country>
<wicri:regionArea>Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm</wicri:regionArea>
<wicri:noRegion>106 91 Stockholm</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Lundeberg, Joakim" sort="Lundeberg, Joakim" uniqKey="Lundeberg J" first="Joakim" last="Lundeberg">Joakim Lundeberg</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Suède</country>
<wicri:regionArea>Science for Life Laboratory, KTH Royal Institute of Technology, SE-100 44 Stockholm, LingVitae AB, Roslagstullsbacken 33, 114 21 Stockholm, Department of Microbiology, Laboratory for Clinical Microbiology, Tumor and Cell Biology, Karolinska University Hospital, Karolinska Institutet, SE-17176 Stockholm, Department of Cell and Molecular Biology, Karolinska Institutet, SE-17177 Stockholm and School of Computer Science and Communication, Stockholm Bioinformatics Center, AlbaNova University Center, Royal Institute of Technology, 106 91 Stockholm</wicri:regionArea>
<wicri:noRegion>106 91 Stockholm</wicri:noRegion>
</affiliation>
<affiliation></affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j" type="main">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1460-2059</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="published">2010</date>
<date type="e-published">2010</date>
<biblScope unit="vol">26</biblScope>
<biblScope unit="issue">13</biblScope>
<biblScope unit="page" from="1595">1595</biblScope>
<biblScope unit="page" to="1600">1600</biblScope>
</imprint>
<idno type="ISSN">1367-4803</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1367-4803</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract">Motivation: New generation sequencing technologies producing increasingly complex datasets demand new efficient and specialized sequence analysis algorithms. Often, it is only the ‘novel’ sequences in a complex dataset that are of interest and the superfluous sequences need to be removed. Results: A novel algorithm, fast and accurate classification of sequences (FACS), is introduced that can accurately and rapidly classify sequences as belonging or not belonging to a reference sequence. FACS was first optimized and validated using a synthetic metagenome dataset. An experimental metagenome dataset was then used to show that FACS achieves accuracy comparable to that of BLAT and SSAHA2 but is at least 21 times faster in classifying sequences. Availability: Source code for FACS, the Bloom filters and the MetaSim dataset used is available at http://facs.biotech.kth.se. The Bloom::Faster 1.6 Perl module can be downloaded from CPAN at http://search.cpan.org/~palvaro/Bloom-Faster-1.6/ Contacts: henrik.stranneheim@biotech.kth.se; joakiml@biotech.kth.se Supplementary information: Supplementary data are available at Bioinformatics online.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Suède</li>
</country>
</list>
<tree>
<country name="Suède">
<noRegion>
<name sortKey="Stranneheim, Henrik" sort="Stranneheim, Henrik" uniqKey="Stranneheim H" first="Henrik" last="Stranneheim">Henrik Stranneheim</name>
</noRegion>
<name sortKey="Allander, Tobias" sort="Allander, Tobias" uniqKey="Allander T" first="Tobias" last="Allander">Tobias Allander</name>
<name sortKey="Andersson, Bjorn" sort="Andersson, Bjorn" uniqKey="Andersson B" first="Björn" last="Andersson">Björn Andersson</name>
<name sortKey="Arvestad, Lars" sort="Arvestad, Lars" uniqKey="Arvestad L" first="Lars" last="Arvestad">Lars Arvestad</name>
<name sortKey="K Ller, Max" sort="K Ller, Max" uniqKey="K Ller M" first="Max" last="K Ller">Max Käller</name>
<name sortKey="Lundeberg, Joakim" sort="Lundeberg, Joakim" uniqKey="Lundeberg J" first="Joakim" last="Lundeberg">Joakim Lundeberg</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002663 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002663 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:9F1C2D341F92D49DE7C480EB2823B669D874C7BC
   |texte=   Classification of DNA sequences using Bloom filters
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021